Index and Materialized View Selection in Data Warehouses

Authors

  • Kamel Aouiche
  • Jérôme Darmont
Abstract

Database management systems (DBMSs) require an administrator whose principal tasks are data management, at both the logical and physical levels, and performance optimization. With the wide development of databases and data warehouses, minimizing the administration function is crucial. This function includes the selection of suitable physical structures to improve system performance. View materialization and indexing are presumably some of the most effective optimization techniques adopted in relational implementations of data warehouses. Materialized views are physical structures that improve data access time by precomputing intermediary results. End-user queries can therefore be processed efficiently from the data stored in views, without accessing the original data. Indexes are also physical structures; they allow direct data access, avoid sequential scans, and thereby reduce query response time. Nevertheless, both solutions require additional storage space and entail maintenance overhead. The issue is then to select a configuration of materialized views and indexes that minimizes both query response time and maintenance cost within a limited storage space. This problem is NP-hard (Gupta & Mumick, 2005). The aim of this article is to present an overview of the major families of state-of-the-art index and materialized view selection methods, and to discuss the issues and future trends in data warehouse performance optimization. We particularly focus on data-mining-based heuristics we developed to reduce the complexity of the selection problem and to target the most pertinent candidate indexes and materialized views.
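The selection problem stated in the abstract (minimize query and maintenance cost under a storage limit) can be illustrated with a minimal sketch. This is not the authors' data-mining-based heuristic; it is a generic knapsack-style greedy ranking, and every name and number in it (`Candidate`, `greedy_select`, the sizes and cost estimates) is hypothetical. It also assumes that the benefit of each candidate is independent of the others, which does not hold in general, since indexes and views interact.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A hypothetical candidate index or materialized view."""
    name: str
    size: int           # storage units required
    query_gain: float   # estimated reduction in workload query cost
    maintenance: float  # estimated refresh/update cost

    @property
    def net_benefit(self) -> float:
        # Benefit left after paying the maintenance overhead.
        return self.query_gain - self.maintenance


def greedy_select(candidates, budget):
    """Pick candidates by net benefit per storage unit until the budget is used.

    Assumption: benefits are independent, so the ranking is computed once.
    """
    ranked = sorted(
        (c for c in candidates if c.net_benefit > 0 and c.size > 0),
        key=lambda c: c.net_benefit / c.size,
        reverse=True,
    )
    chosen = []
    remaining = budget
    for c in ranked:
        if c.size <= remaining:
            chosen.append(c)
            remaining -= c.size
    return chosen


# Illustrative, made-up candidates and a storage budget of 60 units.
candidates = [
    Candidate("mv_sales_by_month", 40, 120.0, 20.0),
    Candidate("idx_sales_date", 10, 50.0, 5.0),
    Candidate("mv_sales_by_store", 60, 90.0, 30.0),
    Candidate("idx_customer_region", 15, 10.0, 12.0),  # negative net benefit
]
selected = greedy_select(candidates, budget=60)
selected_names = [c.name for c in selected]
print(selected_names)
```

A greedy ranking like this gives no optimality guarantee; it only shows how the storage constraint and the maintenance overhead enter the trade-off.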

Similar resources

Using Relational Database Constraints to Design Materialized Views in Data Warehouses

Queries to data warehouses often involve hundreds of complex aggregations over large volumes of data, so it is infeasible to compute these queries by scanning the data sources each time. Data warehouses therefore build a large number of materialized views to increase system performance. However, materialized views need to be updated immediately when their sources change, leading to a pos...

Rewriting OLAP Queries Using Materialized Views and Dimension Hierarchies in Data Warehouses

OLAP queries involve a lot of aggregations on a large amount of data in data warehouses. To process expensive OLAP queries efficiently, we propose a new method for rewriting a given OLAP query using various kinds of materialized aggregate views which already exist in data warehouses. We first define the normal forms of OLAP queries and materialized views based on the lattice of dimension hierar...

Practical Approach to Selecting Data Warehouse Views Using Data Dependencies

Data warehouses integrate information from heterogeneous sources and enable efficient analysis of the information. The two main characteristics of data warehouses are the huge volumes of data they store and the requirement of fast access to the data. Because of the huge volumes of data, simple search techniques are not sufficient. Materialized views in data warehouses are typically complicated, bas...

TSGV: a table-like structure-based greedy method for materialized view selection in data warehouses

Since a data warehouse deals with huge amounts of data and complex analytical queries, processing and answering users’ queries online can be a serious challenge. In on-line analytical processing, materialized views are used to speed up query processing instead of accessing the database directly. Since the large number and high volume of views prevents all of the views from b...

Materialized View Selection by Query Clustering in XML Data Warehouses

XML data warehouses form an interesting basis for decision-support applications that exploit complex data. However, native XML database management systems currently offer limited performance, and it is necessary to design strategies to optimize them. In this paper, we propose an automatic strategy for the selection of XML materialized views that exploits a data mining technique, more precisely th...

Journal title:
  • CoRR

Volume abs/1701.08029

Publication date 2015